10.6 Microbiome Single-Omics Quick-Start Example
This section demonstrates a complete and commonly used analytical workflow for microbiome single-omics data using example datasets.
Example data download: GitHub link
10.6.1 Importing Microbiome Data
library(EasyMultiProfiler)
meta_data <- read.table('coldata.txt', header = TRUE, row.names = 1)
data <- read.table('tax.txt', header = TRUE, sep = '\t')
MAE <- EMP_easy_import(data = data, coldata = meta_data, type = 'tax')
10.6.2 Exploring Microbiome Data
View Current Microbiome Assay
MAE |>
  EMP_assay_extract()    # View the abundance matrix
MAE |>
  EMP_coldata_extract()  # View sample metadata (phenotypes)
MAE |>
  EMP_rowdata_extract()  # View taxonomic annotations
View Phylum-Level Data
MAE |>
  EMP_assay_extract() |>
  EMP_collapse(collapse_by = 'row', estimate_group = 'Phylum')
View Class-Level Data
MAE |>
  EMP_assay_extract() |>
  EMP_collapse(collapse_by = 'row', estimate_group = 'Class')
10.6.3 Rarefaction (Optional)
Rarefy Using the Smallest Read Count Among Samples
MAE |>
  EMP_assay_extract() |>
  EMP_rrarefy()
Rarefy to a Custom Read Depth
MAE |>
  EMP_assay_extract() |>
  EMP_rrarefy(raresize = 5000)
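To make the idea behind rarefaction concrete, the following is a minimal base-R sketch, not the implementation used by EMP_rrarefy. It assumes rarefaction means drawing a fixed number of reads from each sample without replacement; the helper rarefy_sample and the toy count matrix are hypothetical.

```r
# Rarefy one sample's counts to a fixed depth by drawing reads without replacement.
rarefy_sample <- function(counts, depth) {
  stopifnot(sum(counts) >= depth)
  reads <- rep(seq_along(counts), times = counts)  # one entry per read, labelled by taxon index
  picked <- sample(reads, size = depth, replace = FALSE)
  tabulate(picked, nbins = length(counts))         # count the drawn reads per taxon
}

set.seed(1)  # subsampling is random; fix the seed for reproducibility

# Toy count matrix (hypothetical values): rows are taxa, columns are samples.
counts <- cbind(S1 = c(120, 30, 50),
                S2 = c(400, 10, 90),
                S3 = c(80, 80, 40))
rownames(counts) <- paste0("taxon_", 1:3)

depth <- min(colSums(counts))  # smallest library size among the samples
rarefied <- apply(counts, 2, rarefy_sample, depth = depth)
rownames(rarefied) <- rownames(counts)
colSums(rarefied)              # every sample now has the same total
```

Samples whose library size already equals the chosen depth are returned unchanged, while deeper samples are subsampled. A sample with fewer reads than the depth cannot be rarefied this way; the sketch stops with an error in that case.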
10.6.4 Data Normalization
Convert to Relative Abundance at Genus Level
MAE |>
  EMP_assay_extract() |>
  EMP_collapse(collapse_by = 'row', estimate_group = 'Genus') |>
  EMP_decostand(method = 'relative')
Apply CLR Transformation at Genus Level
MAE |>
  EMP_assay_extract() |>
  EMP_collapse(collapse_by = 'row', estimate_group = 'Genus') |>
  EMP_decostand(method = 'clr')
Apply Log2(x + 1) Transformation at Genus Level
MAE |>
  EMP_assay_extract() |>
  EMP_collapse(collapse_by = 'row', estimate_group = 'Genus') |>
  EMP_decostand(method = 'log2+1')
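For readers who want to see what the three transformations compute, here is a small base-R sketch on a toy genus-by-sample matrix. It illustrates the standard definitions only. How EMP_decostand handles zeros before the CLR step is not stated in this section, so the pseudocount of 0.5 below is an assumption made for illustration.

```r
# Toy count matrix (hypothetical values): rows are genera, columns are samples.
counts <- cbind(S1 = c(10, 0, 30, 60),
                S2 = c(5, 5, 20, 70))
rownames(counts) <- paste0("genus_", 1:4)

# Relative abundance: divide each column by its total so every sample sums to 1.
rel <- sweep(counts, 2, colSums(counts), "/")

# CLR: log of each value minus the mean log value within the same sample.
# Zeros have no logarithm, so a pseudocount is added first (0.5 is an arbitrary choice).
log_counts <- log(counts + 0.5)
clr <- sweep(log_counts, 2, colMeans(log_counts), "-")

# log2(x + 1): zeros map to zero and large counts are compressed.
log2p1 <- log2(counts + 1)

round(rel, 3)
round(clr, 3)
```

After CLR, the values within each sample average to zero, which is a quick way to check that the transformation was applied per sample rather than per genus.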
10.6.5 Batch Effect Correction (Optional)
Correct for Batch Effects by Region at Genus Level
MAE |>
  EMP_assay_extract() |>
  EMP_collapse(collapse_by = 'row', estimate_group = 'Genus') |>
  EMP_adjust_abundance(.factor_unwanted = 'Region',
                       .factor_of_interest = 'Group',
                       method = 'combat_seq')
10.6.6 Core Microbiome Identification
Identify Core Genera with Minimum Abundance 0.001 and Prevalence >70% in at Least One Group
MAE |>
  EMP_assay_extract() |>
  EMP_collapse(collapse_by = 'row', estimate_group = 'Genus') |>
  EMP_identify_assay(estimate_group = 'Group', method = 'default',
                     min = 0.001, min_ratio = 0.7)
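The filtering rule described in the heading above can be sketched in base R as follows. This is an interpretation, not the code behind EMP_identify_assay: it assumes a genus counts as present in a sample when its abundance is at least min, and is kept when the share of such samples reaches min_ratio in at least one group. Whether the package uses strict or non-strict comparisons is not confirmed here, and the toy matrix is hypothetical.

```r
# Toy relative abundances (hypothetical values): rows are genera, columns are samples.
# Columns do not sum to 1 because the remaining taxa are not shown.
rel <- rbind(genus_1 = c(0.50, 0.400, 0.60, 0.0005, 0.00, 0.30),
             genus_2 = c(0.00, 0.002, 0.00, 0.2000, 0.30, 0.25),
             genus_3 = c(0.10, 0.100, 0.00, 0.1000, 0.00, 0.10))
colnames(rel) <- paste0("S", 1:6)
group <- c("A", "A", "A", "B", "B", "B")

min_abund <- 0.001  # corresponds to min
min_ratio <- 0.7    # corresponds to min_ratio

# A genus is "present" in a sample when it reaches the abundance threshold.
present <- rel >= min_abund

# Prevalence of each genus within each group (fraction of samples where present).
prevalence <- sapply(split(seq_along(group), group),
                     function(idx) rowMeans(present[, idx, drop = FALSE]))

# Keep genera whose prevalence reaches the threshold in at least one group.
core <- rownames(rel)[apply(prevalence >= min_ratio, 1, any)]
core
```

In this toy example genus_1 qualifies through group A and genus_2 through group B, while genus_3 is present in only two of three samples in each group and is dropped.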
10.6.7 Alpha Diversity Analysis
Calculate Alpha Diversity for Core Genera and Visualize with Boxplot
MAE |>
  EMP_assay_extract() |>
  EMP_collapse(collapse_by = 'row', estimate_group = 'Genus') |>
  EMP_identify_assay(estimate_group = 'Group', method = 'default',
                     min = 0.001, min_ratio = 0.7) |>
  EMP_alpha_analysis() |>
  EMP_boxplot(estimate_group = 'Group')
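As background for the alpha diversity step, two commonly used indices can be computed by hand in base R. Which indices EMP_alpha_analysis reports is not listed in this section, so the sketch below only illustrates the standard definitions of observed richness and the Shannon index on hypothetical count vectors.

```r
# Shannon index: -sum(p * log(p)) over taxa with non-zero counts.
shannon <- function(counts) {
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

# Observed richness: number of taxa with at least one read.
richness <- function(counts) sum(counts > 0)

# Two toy samples (hypothetical values) with the same richness but different evenness.
even   <- c(25, 25, 25, 25)
skewed <- c(97, 1, 1, 1)

c(richness = richness(even),   shannon = shannon(even))
c(richness = richness(skewed), shannon = shannon(skewed))
```

Both samples contain four taxa, but the evenly distributed one reaches the maximum Shannon value for four taxa, log(4), while the sample dominated by a single taxon scores much lower.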
10.6.8 Beta Diversity Analysis
Calculate Beta Diversity for Core Genera and Generate Ordination Plot
MAE |>
  EMP_assay_extract() |>
  EMP_collapse(collapse_by = 'row', estimate_group = 'Genus') |>
  EMP_identify_assay(estimate_group = 'Group', method = 'default',
                     min = 0.001, min_ratio = 0.7) |>
  EMP_dimension_analysis(method = 'pcoa', distance = 'bray') |>
  EMP_scatterplot(estimate_group = 'Group', show = 'p12html')
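The combination of Bray-Curtis distance and PCoA used above can be illustrated with base R alone. This is a sketch of the general technique, assuming the usual Bray-Curtis formula and classical multidimensional scaling via stats::cmdscale; it does not reproduce the internals of EMP_dimension_analysis, and the helper bray_curtis and the toy data are hypothetical.

```r
# Bray-Curtis dissimilarity between two samples: sum|x - y| / sum(x + y).
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Toy count matrix (hypothetical values): rows are samples, columns are genera.
counts <- rbind(S1 = c(10, 0, 5),
                S2 = c(0, 10, 5),
                S3 = c(5, 5, 0))
colnames(counts) <- paste0("genus_", 1:3)

# Pairwise dissimilarity matrix between samples.
n <- nrow(counts)
d <- matrix(0, n, n, dimnames = list(rownames(counts), rownames(counts)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- bray_curtis(counts[i, ], counts[j, ])
  }
}

# PCoA (classical MDS): place samples on two axes that preserve the distances.
pcoa <- cmdscale(as.dist(d), k = 2, eig = TRUE)
pcoa$points
```

The rows of pcoa$points are the sample coordinates that an ordination plot would draw, typically colored by a grouping variable such as Group.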
10.6.9 Differential Abundance Analysis
Perform Wilcoxon Test and Filter Significant Genera
MAE |>
  EMP_assay_extract() |>
  EMP_collapse(collapse_by = 'row', estimate_group = 'Genus') |>
  EMP_diff_analysis(method = 'wilcox.test', estimate_group = 'Group') |>
  EMP_filter(feature_condition = pvalue < 0.05)
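The per-genus Wilcoxon test can be sketched with base R to show what kind of table the filter operates on. The example below is illustrative and uses hypothetical data. It also computes Benjamini-Hochberg adjusted p-values, on the assumption that the pvalue column filtered above holds unadjusted values; when many genera are tested, filtering on adjusted values may be the safer choice.

```r
# Toy abundances (hypothetical values): rows are genera, columns are 12 samples.
group <- factor(rep(c("A", "B"), each = 6))
abund <- rbind(
  genus_1 = c(1, 2, 3, 4, 5, 6, 11, 12, 13, 14, 15, 16),  # clearly higher in group B
  genus_2 = c(5, 7, 9, 11, 13, 15, 6, 8, 10, 12, 14, 16)  # interleaved, no clear shift
)

# One two-sided Wilcoxon rank-sum test per genus.
p_raw <- apply(abund, 1, function(x) {
  wilcox.test(x[group == "A"], x[group == "B"])$p.value
})

# Benjamini-Hochberg correction across all tested genera.
p_adj <- p.adjust(p_raw, method = "BH")

data.frame(pvalue = p_raw, padj = p_adj)
names(p_raw)[p_raw < 0.05]  # genera passing the unadjusted threshold
```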
Perform DESeq2 Analysis and Filter Significant Genera
MAE |>
  EMP_assay_extract() |>
  EMP_collapse(collapse_by = 'row', estimate_group = 'Genus') |>
  EMP_diff_analysis(method = 'DESeq2', .formula = ~Group) |>
  EMP_filter(feature_condition = pvalue < 0.05, keep_result = TRUE)
10.6.10 Machine Learning for Key Taxa
The EMP package includes built-in methods for feature selection: Boruta, Random Forest, XGBoost, and Lasso. For detailed usage, run help(EMP_marker_analysis).
Identify Important Genera Using Boruta and Visualize with Heatmap
MAE |>
  EMP_assay_extract() |>
  EMP_collapse(collapse_by = 'row', estimate_group = 'Genus') |>
  EMP_identify_assay(estimate_group = 'Group', method = 'default',
                     min = 0.001, min_ratio = 0.7) |>
  EMP_marker_analysis(method = 'boruta', estimate_group = 'Group') |>
  EMP_filter(feature_condition = Boruta_decision != 'Rejected') |>
  EMP_heatmap_plot(palette = 'Spectral', legend_bar = 'auto',
                   clust_row = TRUE, clust_col = TRUE)
Select Height-Associated Species Using Lasso and Plot Group-Level Heatmap
MAE |>
  EMP_assay_extract() |>
  EMP_collapse(collapse_by = 'row', estimate_group = 'Species') |>
  EMP_identify_assay(estimate_group = 'Group', method = 'default',
                     min = 0.001, min_ratio = 0.7) |>
  EMP_marker_analysis(method = 'lasso', estimate_group = 'Height') |>
  EMP_filter(feature_condition = lasso_coe > 0) |>
  EMP_collapse(method = 'mean', estimate_group = 'Group',
               collapse_by = 'col') |>
  EMP_heatmap_plot(palette = 'Spectral', legend_bar = 'auto')
10.6.11 Correlation with Phenotypes
Correlation Heatmap Between Phyla and Phenotypic Variables
phylum_data <- MAE |>
  EMP_assay_extract() |>
  EMP_collapse(collapse_by = 'row', estimate_group = 'Phylum')
meta_data <- MAE |>
  EMP_coldata_extract(action = 'add')
(phylum_data + meta_data) |>
  EMP_cor_analysis() |>
  EMP_heatmap_plot()
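The values shown in such a heatmap are pairwise correlation coefficients between taxa and phenotypes. A minimal base-R sketch of that calculation follows. It uses Spearman rank correlation as an assumption, since the method applied by EMP_cor_analysis is not stated in this section, and the data frames are hypothetical.

```r
# Toy data (hypothetical values): five samples, two phyla, one phenotype.
abund <- data.frame(phylum_1 = c(1, 4, 9, 16, 25),
                    phylum_2 = c(5, 3, 4, 1, 2))
pheno <- data.frame(BMI = c(18, 21, 24, 27, 30))

# Spearman correlation of every phylum (rows) with every phenotype (columns).
rho <- cor(abund, pheno, method = "spearman")
rho
```

Because Spearman correlation works on ranks, phylum_1 correlates perfectly with BMI even though the relationship is not linear.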
10.6.12 Linear Regression with Phenotypes
Linear Fit Between Genus Blautia and BMI
MAE |>
  EMP_assay_extract() |>
  EMP_collapse(collapse_by = 'row', estimate_group = 'Genus') |>
  EMP_fitline_plot(var_select = c('Blautia', 'BMI'))
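The fitted line in such a plot corresponds to an ordinary least-squares regression between the two variables. The sketch below shows the equivalent base-R call on hypothetical data. Treating BMI as the response and Blautia abundance as the predictor is an assumption; this section does not say which of the two variables EMP_fitline_plot places on which axis.

```r
# Toy data (hypothetical values): Blautia relative abundance and BMI for five samples.
df <- data.frame(Blautia = c(0.02, 0.05, 0.08, 0.11, 0.15),
                 BMI     = c(20.1, 22.4, 23.9, 27.2, 29.0))

# Ordinary least-squares fit of BMI on Blautia abundance.
fit <- lm(BMI ~ Blautia, data = df)

coef(fit)               # intercept and slope of the fitted line
summary(fit)$r.squared  # proportion of variance explained
```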
10.6.13 Network Analysis
Network Plot Using Differentially Abundant Genera and Selected Phenotypes
MAE |>
  EMP_assay_extract() |>
  EMP_collapse(estimate_group = 'Genus', collapse_by = 'row') |>
  EMP_diff_analysis(method = 'wilcox.test', estimate_group = 'Group') |>
  EMP_filter(feature_condition = pvalue < 0.05) |>
  EMP_network_analysis(coldata_to_assay = c('BMI', 'PHQ9', 'GAD7')) |>
  EMP_network_plot(node_info = 'Phylum', label.cex = 1, edge.labels = TRUE)